
Remove pandas and datasets from core dependencies #8274

Merged 13 commits on Jun 18, 2025

Conversation

TomeHirata
Collaborator

@TomeHirata TomeHirata commented May 26, 2025

Starting with DSPy 3.0, optimizer-only dependencies such as pandas and datasets are no longer core dependencies of DSPy. This makes DSPy more production-ready: users can run DSPy programs at inference time without installing large packages. This PR reduces the installed size of DSPy's dependencies from 284 MB to 115 MB.
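
For readers unfamiliar with the pattern, here is a minimal sketch of how code can defer an optional import until it is actually needed. It is an illustration only, not the code in this PR: `require_optional`, `load_training_table`, the package name `mypackage`, and the extra name `optimizers` are all hypothetical.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise an error explaining how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # `mypackage` and the extra name are hypothetical placeholders.
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed. "
            f"Install it with `pip install {module_name}` "
            f"or `pip install mypackage[{extra}]`."
        ) from exc


def load_training_table(path: str):
    # pandas is imported only when this optimizer-side helper is called,
    # so an inference-only install never needs it.
    pd = require_optional("pandas", extra="optimizers")
    return pd.read_csv(path)
```

With this shape, importing the module that defines `load_training_table` succeeds without pandas; the error appears only when the helper is called.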

Collaborator

@chenmoneygithub chenmoneygithub left a comment

Left 2 minor comments, otherwise LGTM!

@@ -66,6 +68,9 @@ dev = [
]
test_extras = [
"mcp; python_version >= '3.10'",
"datasets>=2.14.6",
Collaborator

Open for discussion: do you think these should go in dev or test_extras? My first impression is that they fit better in dev, but I'm not sure.

Collaborator Author

@TomeHirata TomeHirata May 28, 2025

I think test_extras should be the combination of all extra dependencies that are installed when running pytest. To my understanding, dev usually contains only packages that help development, such as linters and test libraries. So we could include them in dev, but including them in test_extras is a must.
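
One way to make that requirement checkable is a small guard that reports which extras are absent from the test environment. This is a rough sketch under assumptions: `missing_extras` is a hypothetical helper that is not part of this PR, and the module names passed to it are examples only.

```python
import importlib.util


def missing_extras(module_names: list[str]) -> list[str]:
    """Return the top-level modules from module_names that are not importable."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in module_names if importlib.util.find_spec(name) is None]
```

A test session could call `missing_extras(["datasets", "pandas"])` at startup and fail early with a clear message if the list is not empty.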

@TomeHirata TomeHirata changed the title from "Move pandas, datasets and optuna as optional deps" to "Remove pandas and datasets from core dependencies" on Jun 18, 2025
@TomeHirata TomeHirata force-pushed the refactor/lite/dependencies branch from 96d1500 to d887229 on June 18, 2025 07:23
@TomeHirata TomeHirata merged commit 440101d into stanfordnlp:main Jun 18, 2025
11 checks passed
2 participants